65 research outputs found

    A multiple layer model to compare RNA secondary structures

    We formally introduce a new data structure called MiGaL, for "Multiple Graph Layers", composed of several graphs linked together by relations of abstraction/refinement. The structure is useful for representing information that can be described at different levels of abstraction, each level corresponding to a graph. We then propose an algorithm for comparing two MiGaLs. The algorithm performs a step-by-step comparison starting with the most "abstract" level. The result of the comparison at a given step is communicated to the next step using a special colouring scheme. MiGaLs are a very natural model for comparing RNA secondary structures, which may be seen at different levels of detail, going from the sequence of nucleotides, single or paired with another to participate in a helix, to the network of multiple loops that is believed to represent the most conserved part of RNAs having similar function. We therefore show how to use MiGaLs to very efficiently compare two RNAs of any size at different levels of detail
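
    The layered structure described in this abstract can be pictured with a minimal sketch. It is not the authors' implementation: the class names `Layer` and `LayeredGraphs`, the fields `layers` and `refines`, and the toy helix example are all assumptions made for illustration, and the colouring-based comparison algorithm is not reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One graph of the stack: a set of node ids and a set of edges."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # pairs (u, v) of node ids


@dataclass
class LayeredGraphs:
    """Hypothetical container; layers[0] is the most abstract level."""
    layers: list
    # refines[i] maps a node of layers[i] to the set of nodes of
    # layers[i + 1] that refine it (the abstraction/refinement relation).
    refines: list

    def refine(self, level, node):
        """Nodes of the next, more detailed level that refine `node`."""
        return self.refines[level].get(node, set())

    def abstract(self, level, node):
        """Nodes of the previous, more abstract level that `node` refines."""
        if level == 0:
            return set()  # nothing is more abstract than the top level
        coarser = self.refines[level - 1]
        return {a for a, detailed in coarser.items() if node in detailed}


# Toy example (assumed data): one helix refined into two base pairs.
helices = Layer(nodes={"H1"})
pairs = Layer(nodes={"p1", "p2"}, edges={("p1", "p2")})
toy = LayeredGraphs(layers=[helices, pairs], refines=[{"H1": {"p1", "p2"}}])
```

    Storing the relation top-down keeps refinement lookups direct, which matches a comparison that starts at the most abstract level; the reverse lookup is computed on demand.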

    BRASERO: A Resource for Benchmarking RNA Secondary Structure Comparison Algorithms

    The pairwise comparison of RNA secondary structures is a fundamental problem, with direct application in mining databases for annotating putative noncoding RNA candidates in newly sequenced genomes. An increasing number of software tools are available for comparing RNA secondary structures, based on different models (such as ordered trees or forests, arc annotated sequences, and multilevel trees) and computational principles (edit distance, alignment). We describe here the website BRASERO that offers tools for evaluating such software tools on real and synthetic datasets

    The Gapped-Factor Tree

    We present a data structure to index a specific kind of factors (that is, substrings) called gapped-factors. A gapped-factor is a factor containing a gap that is ignored during indexing. The data structure is based on the suffix tree and indexes all the gapped-factors of a text with a fixed gap size, and only those. It is constructed online in linear time and space. Such a data structure may play an important role in various pattern matching and motif inference problems, for instance in text filtration
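
    To make the notion of a gapped-factor concrete, here is a naive sketch that indexes them in a dictionary. It is not the suffix-tree-based structure of the abstract and does not share its linear-time online construction. The function name `gapped_factor_index` and the parameters `left_len` and `right_len` (fixed lengths for the two parts around the gap) are assumptions made for illustration.

```python
from collections import defaultdict


def gapped_factor_index(text, left_len, gap, right_len):
    """Map each (left, right) pair to the start positions where `left`
    occurs, followed by `gap` ignored characters, followed by `right`."""
    index = defaultdict(list)
    span = left_len + gap + right_len
    for start in range(len(text) - span + 1):
        left = text[start:start + left_len]
        # The `gap` characters after `left` are skipped, not indexed.
        right = text[start + left_len + gap:start + span]
        index[(left, right)].append(start)
    return dict(index)


# "ab", any one character, then "ab" occurs at positions 0 and 3.
occurrences = gapped_factor_index("abcabdab", left_len=2, gap=1, right_len=2)
```

    Here the occurrences of `("ab", "ab")` differ in their gap character ("c" and "d") yet fall under the same key, which is what ignoring the gap means.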

    The pandemic toll and post-acute sequelae of SARS-CoV-2 in healthcare workers at a Swiss University Hospital.

    Healthcare workers have potentially been among the most exposed to SARS-CoV-2 infection as well as to the deleterious toll of the pandemic. This study aimed to differentiate the pandemic toll from post-acute sequelae of SARS-CoV-2 infection in healthcare workers compared to the general population. The study was conducted between April and July 2021 at the Geneva University Hospitals, Switzerland. Eligible participants were all tested staff and outpatient individuals tested for SARS-CoV-2 at the same hospital. The primary outcome was the prevalence of symptoms in healthcare workers compared to the general population, with measures of COVID-related symptoms and functional impairment, using prevalence estimates and multivariable logistic regression models. Healthcare workers (n=3,083) suffered mostly from fatigue (25.5%), headache (10.0%), difficulty concentrating (7.9%), exhaustion/burnout (7.1%), insomnia (6.2%), myalgia (6.7%) and arthralgia (6.3%). Regardless of SARS-CoV-2 infection, all symptoms were significantly more prevalent in healthcare workers than in the general population (n=3,556). SARS-CoV-2 infection in healthcare workers was associated with loss or change in smell, loss or change in taste, palpitations, dyspnea, difficulty concentrating, fatigue, and headache. Functional impairment was more likely in healthcare workers than in the general population (aOR 2.28; 1.76-2.96), with a positive association with SARS-CoV-2 infection (aOR 3.81; 2.59-5.60). Symptoms and functional impairment in healthcare workers were increased compared to the general population, and potentially related to the pandemic toll as well as to post-acute sequelae of SARS-CoV-2 infection. These findings are of concern, considering the essential role of healthcare workers in caring for all patients including and beyond COVID-19

    Structural Stability of Human Protein Tyrosine Phosphatase ρ Catalytic Domain: Effect of Point Mutations

    Protein tyrosine phosphatase ρ (PTPρ) belongs to the classical receptor type IIB family of protein tyrosine phosphatases, the most frequently mutated tyrosine phosphatase in human cancer. There is evidence to suggest that PTPρ may act as a tumor suppressor gene, and dysregulation of Tyr phosphorylation can be observed in diverse diseases, such as diabetes, immune deficiencies and cancer. PTPρ variants in the catalytic domain have been identified in cancer tissues. These natural variants are nonsynonymous single nucleotide polymorphisms, variations of a single nucleotide occurring in the coding region and leading to amino acid substitutions. In this study we investigated the effect of amino acid substitution on the structural stability and on the activity of the membrane-proximal catalytic domain of PTPρ. We expressed and purified, as soluble recombinant proteins, some of the mutants of the membrane-proximal catalytic domain of PTPρ identified in colorectal cancer and in the single nucleotide polymorphisms database. The mutants show decreased thermal and thermodynamic stability and decreased activation energy relative to phosphatase activity when compared to wild-type. All the variants show three-state equilibrium unfolding transitions similar to that of the wild-type, with the accumulation of a folding intermediate populated at ~4.0 M urea

    Comparaison de structures secondaires d'ARN

    RNAs are one of the fundamental elements of a cell. Generally, RNAs are defined as oriented sequences of nucleotides (denoted by A, C, G and U). Inside a cell, RNAs do not have a linear shape but fold in space. The molecular function of an RNA strongly depends on this three-dimensional folding. Hence, comparing the three-dimensional structure of two RNAs is essential to determine whether they share the same function. The structure of an RNA is generally divided into three parts. The first is the primary structure, which corresponds to the sequence of nucleotides. The secondary structure is composed of the list of links between nucleotides that represent helices. Finally, the tertiary structure corresponds to the exact three-dimensional folding of the RNA. Although the tertiary structure is the most accurate definition of the spatial structure adopted by an RNA, it is well known that two RNAs sharing the same function will also have closely related secondary structures. A few other structural elements can be distinguished in an RNA secondary structure. These are the helices, multiloops, hairpin loops, internal loops and bulges. Up until now, essentially three data structures have been proposed to represent an RNA secondary structure: arc-annotated sequences, 2-intervals and rooted ordered trees. Arc-annotated sequences are sequences with arcs between nucleotides of the sequence that form a pair in the structure. 2-intervals generalise arc-annotated sequences and correspond to two disjoint intervals. An RNA secondary structure is then defined as a family of 2-intervals. Finally, rooted ordered trees can represent an RNA secondary structure at various levels, from the nucleotides up to the network of multiloops. One of the drawbacks of all these approaches is that they model the secondary structure of an RNA from a specific point of view (nucleotides, helices, etc.). We decided to introduce a new model called RNA-MiGaL, made of four interrelated trees. 
Each of these trees represents the structure of an RNA at a particular level of detail: the upper level models the network of multiloops that is considered the skeleton of the secondary structure, while the lower level represents nucleotides. We use the tree edit distance to compare two RNA-MiGaLs. However, due to some limitations of the classical edit distance when comparing trees that represent RNA secondary structures, we introduced two new edit operations named "node fusion" and "edge fusion", thus providing a new edit distance. Using this distance, we developed an algorithm to compare two RNA-MiGaLs. The algorithm has been implemented in a package which allows RNA secondary structures to be compared in various ways.
    RNA, ribonucleic acid, is one of the fundamental components of the cell. Most RNAs consist of an oriented sequence of nucleotides denoted A, C, G and U. Such a sequence folds in space by forming pairwise bonds between nucleotides. The function of RNAs within the cell is linked to the spatial conformation they adopt. It is therefore essential to be able to compare two RNAs at the level of their conformation, for example to determine whether two RNAs have the same function. Three levels are distinguished in the structure of an RNA. The primary structure corresponds to the sequence of nucleotides, the secondary structure consists of the list of bonds formed between nucleotides, and the tertiary structure is the exact description of the three-dimensional shape of the molecule (coordinates of each nucleotide). Although the tertiary structure is the one that best describes the spatial shape of the RNA, it is accepted that two RNAs with a similar molecular function have closely related secondary structures. At the level of the secondary structure, once the nucleotide bonds are formed, one can distinguish secondary structure elements such as helices, multiple loops, hairpin loops, internal loops and bulges. Essentially two formalisms have been proposed to date to model the secondary structure of RNAs. Arc-annotated sequences represent the RNA sequence, with the arcs encoding the bonds between letters (nucleotides of the sequence). 2-intervals, a generalisation of arc-annotated sequences, are formed by two disjoint intervals. The secondary structure can then be seen as a family of 2-intervals. Finally, rooted ordered trees offer many possibilities for encoding the secondary structure, from the nucleotide level to the level of the network of multiple loops. One drawback of these approaches is that they model the secondary structure of the RNA from a particular point of view (nucleotides, helices, etc.). We propose a new model, called RNA-MiGaL, made up of four interlinked trees representing the structure at different levels of precision. The highest level encodes the network of multiple loops, considered the skeleton of the molecule. The last level details the nucleotides. To compare such structures we use the notion of edit distance between two trees. However, given certain limitations of this distance for comparing trees that represent the secondary structure at a high level of abstraction, we introduced a new edit distance that takes into account two new edit operations: node fusion and edge fusion. Using this new distance, we provide an algorithm for comparing two RNA-MiGaLs. It is implemented in a program that allows two RNA secondary structures to be compared
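
    The arc-annotated representation mentioned in this abstract can be illustrated with a short sketch. It assumes the structure is supplied in dot-bracket notation, a common text encoding that the abstract itself does not mention; the function name `arcs_from_dot_bracket` and the example sequence are likewise assumptions.

```python
def arcs_from_dot_bracket(structure):
    """Return the arcs (i, j), i < j, of a dot-bracket string, where
    '(' and ')' mark paired positions and '.' an unpaired one."""
    stack = []  # positions of '(' still waiting for their partner
    arcs = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            arcs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(arcs)


# A hairpin: three nested base pairs closing a loop of three nucleotides.
sequence = "GGGAAACCC"
structure = "(((...)))"
arcs = arcs_from_dot_bracket(structure)
annotated = [(sequence[i], sequence[j]) for i, j in arcs]
```

    The sequence together with `arcs` forms an arc-annotated sequence; grouping runs of nested arcs into helices would give the coarser views used by the higher levels of the model.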

    Novel tree edit operations for RNA secondary structure

    We describe an algorithm for comparing two RNA secondary structures coded in the form of trees that introduces two novel operations, called node fusion and edge fusion, besides the tree edit operations of deletion, insertion and relabelling classically used in the literature. This allows us to address some serious limitations of the more traditional tree edit operations when the trees represent RNAs and what is searched for is a common structural core of two RNAs. Although the algorithm complexity has an exponential term, this term depends only on the number of successive fusions that may be applied to the same node, not on the total number of fusions. The algorithm therefore remains efficient in practice and is used for illustrative purposes on ribosomal as well as on other types of RNAs
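
    For reference, the three classical operations named in this abstract (deletion, insertion, relabelling) can be sketched as a memoised recursion on ordered forests. This is a baseline under assumed unit costs, written for clarity rather than speed; it does not implement node fusion or edge fusion, and the tuple encoding of trees and the function names are assumptions.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a tuple
# of trees; a forest is a tuple of trees. A leaf is (label, ()).


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests."""
    if not f1 and not f2:
        return 0
    if not f1:
        _, children = f2[-1]
        # Insert the rightmost root of f2; its children take its place.
        return forest_distance(f1, f2[:-1] + children) + 1
    if not f2:
        _, children = f1[-1]
        # Delete the rightmost root of f1; its children take its place.
        return forest_distance(f1[:-1] + children, f2) + 1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    # Map the two rightmost roots onto each other, relabelling if needed.
    match = (forest_distance(kids1, kids2)
             + forest_distance(f1[:-1], f2[:-1])
             + (0 if label1 == label2 else 1))
    return min(delete, insert, match)


def tree_distance(t1, t2):
    """Edit distance between two single trees."""
    return forest_distance((t1,), (t2,))


chain = ("a", (("b", (("c", ()),)),))  # a -> b -> c
short = ("a", (("c", ()),))            # a -> c
print(tree_distance(chain, short))     # deleting b suffices
```

    Deleting a node promotes its children to its parent, which is why removing `b` from the chain yields the shorter tree at cost 1.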